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Abstract —We propose a subspace constrained precoding 
scheme that exploits the spatial channel correlation structure in 
massive MIMO cellular systems to fully unleash the tremendous 
gain provided by massive antenna array with reduced channel 
state information (CSI) signaling overhead. The MIMO precoder 
at each base station (BS) is partitioned into an inner precoder 
and a Transmit (Tx) subspace control matrix. The inner precoder 
is adaptive to the local CSI at each BS for spatial multiplexing 
gain. The Tx subspace control is adaptive to the channel statistics 
for inter-cell interference mitigation and Quality of Service 
(QoS) optimization. Specifically, the Tx subspace control is 
formulated as a QoS optimization problem which involves an 
SINK chance constraint where the probability of each user’s 
SINR not satisfying a service requirement must not exceed a given 
outage probability. Such chance constraint cannot be handled by 
the existing methods due to the two-stage precoding structure. To 
tackle this, we propose a bi-convex approximation approach, which 
consists of three key ingredients: random matrix theory, chance 
constrained optimization and semidefinite relaxation. Then we 
propose an efficient algorithm to find the optimaf sofution 
of the resuiting bi-convex approximation probiem. Simulations 
show that the proposed design has significant gain over various 
basefines. 

Index Terms —Massive MIMO, Subspace constrained precod¬ 
ing, QoS Guarantees 

I. Introduction 

Massive MIMO has been regarded as one of the key 
technologies in future wireless networks. In massive MIMO 
cellular systems, each BS is equipped with M> 1 antennas. 
This large spatial degree of freedom (DoF) of massive MIMO 
systems can be exploited to significantly increase the spectrum 
and energy efficiency (TJ. Specifically, there are two important 
roles for the spatial DoF introduced by the massive MIMO, 
namely the intra-cell spatial multiplexing and the inter-cell 
interference mitigation. However, there are several practical 
issues towards achieving the huge performance gain predicted 
by the theoretical analysis in massive MIMO systems. First, 
to realize the spatial multiplexing gains (combating intra¬ 
cell interference) within each BS using conventional MU- 
MIMO precoding Q, [3j, real-time CSIT (i.e, channel state 
information at the BS) is required. In most of the existing 
works, Time-Division Duplex (TDD) is assumed and channel 
reciprocity can be exploited to obtain CSIT via uplink pilot 
training. However, in this paper, we are focusing on a more 
challenging but important Frequency-Division Duplex (FDD) 
mode of Massive MIMO. In this case, channel reciprocity 
cannot work and the channel estimation has to be obtained via 
downlink channel estimation and channel feedback, which is 
practically infeasible because the number of independent pilot 
symbols available for channel estimation is fundamentally 


limited by the channel coherence time and it may become 
much smaller than M as M grows large. Second, to realize 
the inter-cell interference mitigation gains in massive MIMO 
using coordinated beamforming 0 or cooperative MIMO |5j, 
real-time global CSIT knowledge is required for cooperative 
MIMO and at least partial global CSIT (i.e., each BS needs 
to know the channels from this BS to all users) is required 
for coordinated beamforming. However, the acquisition of 
(partial) global CSIT is a very challenging problem in practical 
massive MIMO systems because both the downlink pilot 
training and the uplink CSI feedback overhead will become 
unacceptable as the number of antennas M grows large. It is 
even more difficult to obtain real-time (partial) global CSIT 
due to the backhaul signaling latency. 

In this paper, we propose a two stage subspace constrained 
precoding for massive MIMO cellular systems. In practice, 
due to local scattering effects |6j, 0, the channel vectors 
of users are usually concentrated in a subspace with a much 
smaller dimension than M when M is large. The proposed 
two-stage precoding takes advantage of this limited scattering 
property to achieve intra-cell spatial multiplexing and inter- 
cell interference mitigation without all the aforementioned 
practical issues. Specifically, the MIMO precoder at each BS 
is partitioned into an inner precoder and a semi-unitary Tx 
subspace control matrix. The inner precoder is adaptive to real¬ 
time local CSIT per BS for intra-cell spatial multiplexing gain. 
The Tx subspace control matrix is adaptive to long-term chan¬ 
nel statistics to achieve the best tradeoff between direct link 
diversity gain and cross link (inter-cell) interference mitigation 
such that the QoS of the users is maximized. Such subspace 
constrained precoding structure simultaneously resolves the 
aforementioned practical challenges. For instance, the issue of 
insufficient pilot symbols for real-time local CSI estimation is 
resolved because the BS only needs to estimate the CSI within 
the subspace determined by the Tx subspace control, which 
is of a much smaller dimension than M. Furthermore, the Tx 
subspace control is adaptive to the long-term channel statistics, 
which is insensitive to the backhaul latency. As a result, the 
proposed subspace constrained precoding fully exploits the 
large number of antennas to simultaneously achieve intra-cell 
spatial multiplexing per BS and inter-cell interference mitiga¬ 
tion without an expensive backhaul signaling requirement. 

In®, a similar two-stage precoding structure was proposed 
for single cell massive MIMO systems, where a simple pre¬ 
beamforming (with a pre-determined dimension) is used to 
control intra-cell interference and achieve spatial multiplex¬ 
ing gain based on block diagonalization (BD). However, 
the solution in |8| for single cell systems is fundamentally 



Proposed two stage precoding 

Two stage precoding in |8 | 

System topology 

Multi-cell massive MIMO system 

Single cell massive MIMO - system 

Roles of two-stage precoding 

Intra-cell & Inter-cell interference mitigation 

Intra-cell interference mitigation 

Subspace dimension partitioning 

Dynamic 

Static 

Adaptive to heterogeneous path loss 
and QoS requirements explicitly 

Yes 

No 

Design approach 

Non-trivial design based on optimization 

Simple design based on BD method 

Performance 

Good 

Moderate 


Table I: Summary of the differences compared with existing (state-of-the-art) two stage precoding scheme in |8 


different from the proposed solution for multi-cell systems 
with heterogeneous path loss and heterogeneous QoS require¬ 
ments, as summarized in Table III For example, the BD- 
based pre-beamforming design in f§J does not explicitly take 
into account the heterogeneous path loss and QoS require¬ 
ments and thus may be far from optimal as shown in the 
simulations. Moreover, the pre-beamforming design in [0 
is based on a heuristic (BD) method and the dimensions 
of the pre-beamforming matrices for different user clusters 
are pre-determined and fixed. In this paper, the design of 
Tx subspace control is formulated as a QoS optimization 
problem with individual QoS requirements and the dimen¬ 
sions of Tx subspace control matrices are also included in 
the optimization. As shown in the simulations, in multi-cell 
systems with heterogeneous path loss and QoS requirements, 
it is very important to dynamically allocate the dimensions of 
the Tx subspace control matrices over different user clusters. 
With optimal dynamic subspace dimension partitioning, the 
proposed solution can achieve a much better performance than 
the two-stage precoding in |0 with static subspace dimension 
partition. However, the optimization based solution in this 
paper also causes several non-trivial technical challenges. 

• SINR Chance Constraint under Two-Stage Precoding: 

The QoS optimization problem involves an SINR chance 
constraint where the probability of each user’s SINR not 
satisfying a service requirement must not exceed a given 
outage probability. Such SINR chance constraint does not 
have a closed form expression. There are various Bern¬ 
stein techniques to derive a safe convex approximation 
for the chance constraints of standard forms 0. ED- 
However, due to the two-stage precoding structure, the 
associated SINR chance constraint does not belong to 
any of the standard forms. 

• Combinatorial Nature of Dimension Partitioning: Due 

to the combinatorial nature, the optimization of subspace 
dimension partitioning is very challenging and brute force 
solution has high complexity for large M. 

In this paper, we address the above technical challenges 
using a novel bi-convex approximation approach. The first 
challenge is addressed using the interference approximation 
and SINR chance constraint restriction , where we combine 
the techniques in random matrix theory and chance constraint 
optimization to construct a deterministic and (asymptotically) 
conservative approximation for the SINR chance constraint. 
The second challenge is addressed using the semidefinite 
relaxation (SDR), where we apply the SDR technique to obtain 
a relaxed bi-convex problem which does not explicitly involve 
the combinatorial optimization w.r.t. the dimension partition¬ 


ing variable. We further prove the tightness of the SDR using 
the specific structure of the Tx subspace control problem. 
Finally, we propose a low complexity iterative algorithm to 
solve this bi-convex problem and find the Tx subspace control 
solution by combining the convex optimization method and 
the bisection search method. Simulations show that the pro¬ 
posed design achieves significant performance gain compared 
with various state-of-the-art baselines under various backhaul 
signaling latency. 

Notations : The superscripts (-) T and (-)^ denote transpose 
and Hermitian respectively. For a set S, |<S| denotes the 
cardinality of S. The operator diag (a) represents a diagonal 
matrix whose diagonal elements are the elements of vector a. 
For a set of K vectors a., £ C M ,i = 1 and a subset 

S C {1 the notation [a] igS denote a M x N matrix 

whose columns are drawn from the vectors in {a;, i £ S }. The 
notation U MxAr denotes the set of all M x N semi-unitary 
matrices. For a matrix A, diag (A) represents a diagonal 
matrix whose diagonal elements are the diagonal elements of 
A. span (A) represents the subspace spanned by the columns 
of a matrix A. ||A|| is the spectral radius of A. 

II. System Model 

A. Massive MIMO Cellular System 

Consider the downlink of a massive MIMO cellular system 
with L BSs and K single-antenna users as illustrated in Fig. |T] 
for L = 2 and K = 8. Each BS has M antennas with M much 
larger than the number of the associated users. The massive 
MIMO cellular system can be represented by a topology graph 
as define below. 

Definition 1 (System Topology Graph). Define the topology 
graph of the massive MIMO cellular system as a bipartite 
graph Qt = { 13,U ■£}, where B denotes the set of all BS 
nodes, U denotes the set of all user nodes, and £ is the set of all 
edges between the BSs and users. An edge (k, l) £ £ between 
BS node l £ B and user node k £ U represents a wireless 
link between them. Each edge (k, l) £ £ is associated with 
a CSI label h k,i £ C M , which represents the channel vector 
between BS l and user k. For each user node k, let bk denote 
the serving BS. ■ 

An example of a massive MIMO cellular system and the 
corresponding topology graph is illustrated in Fig. [T] 

B. Massive MIMO Channel Model 

The channel from BS l to user k is modeled as = 
&k// 2 hfi, Mk,l, where £ C M has i.i.d. complex entries 
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(a) A massive MIMO cellular system with 2 BSs and 
8 users. 



(b) The corresponding topology graph 
S T (0) = where B = {1,2}, 

U = {1,2,3,4,5,6,7,8}. 

Figure 1: An example of a massive MIMO cellular system and the 
corresponding topology graph. 


of zero mean and unit variance; and 0/,. / £ C MxM is a 
positive semi-definite spatial correlation matrix between BS 
l and user k. Assume block fading channel where h}’ ; is 
fixed for a time slot but changes over time slots. As such, 
the CSI is divided into instantaneous CSI H = {h/,. /} and 
global statistical information © = {©/,. /} (spatial correlation 
matrices). We make the following assumption on the spatial 
correlation matrices. 

Assumption 1 (Spatial Correlation Model). The spatial channel 
correlation matrices satisfy the following two assumptions. 

1) The number of dominant eigenvalues (i.e., the eigenval¬ 
ues that are comparable to the maximum eigenvalue) of 
0/,../,. Vfc, l is relatively small compared to the number 
of antennas M. 

2) The users can be grouped into N user clusters U n ,n = 
1, ...,N such that all users in U n are associated with the 
same BS denoted by l n and & k ^ = L k ^ 0° , Vfc £ U n , 
where L k ~, > 0 is the path loss between BS l n and user 
k, and Tr (0° ) = M is a normalized spatial correlation 
matrix. 

Many experimental measurements as reported in ED- G3 
show that practical MIMO channels indeed have spatial chan¬ 
nel correlation due to limited scattering or LOS transmission. 
The standard channel models such as IMT-advanced channel 
model m also predicts that MIMO channel can have high 
spatial correlation especially when the angular spread is small. 
Furthermore, in a massive MIMO system with a large number 
of antennas assembled within a limited space at the BS, 
the channels are more likely to be highly correlated due to 
insufficient spacing among the antennas 0’ G3- As a result, 
the first assumption has been made in many existing works on 
massive MIMO systems, see, e.g., 0, EHH). In practice, 
we can also cluster users into groups using the user clustering 
algorithm in © such that the second assumption can be 
closely achieved |j8j. Note that the above channel model is 


more general than that considered in ( 8 J. For example, the ra¬ 
th cluster U n is allowed to contain a single user and the users 
in the same cluster can have different path loss. 

At each time slot, linear precoding is employed at BS l n to 
support simultaneous downlink transmissions to the associated 
users in U n . Let fi,t denote the index of the cluster that contains 
user k and let Bk = {ra : ra 7 ^ n-k, ( k,l n ) £ £} denote the set 
of user clusters that interfere with user k. Then the received 
signal for user k can be expressed as: 

Vk = h l,b fc \/^ v feSfe+ h l,6 fc 

k' &tn k \{k} 

S -V-' 

intra-cluster interference 

+ E h I Jn V n P i /2 S n + %, (1) 

n&Bk 

S *' ^ 

inter-cluster interference 

where Sk ~ CAf (0,1) is the data symbol for user k; Vk £ C M 
with |||| = 1 is the precoding vector for user fc; Pk is the 
power allocated to user fc; s n = [s;] ;eW £ <C\ U ^ is the data 
symbol vector at BS ra; V„ = [vj] igW £ C Mx l w "l is the 
precoding matrix at BS ra; P„ = diag (\Pi\i eU ) G I x I 
is the power allocation matrix for the ra-th user cluster; and 
Zk ~ CAf (0, 1) is the AWGN noise. 


C. Two-stage Subspace Constrained Precoding 

The precoder V„ for the ra-th cluster at BS l n is assumed 
to have a two-stage structure V„ = F„ G„. The Tx sub¬ 
space control variable F„ £ \J MxS " is used to mitigate 
the inter-cluster interference (including inter-cell interference) 
in massive MIMO systems, where S n £ {\U n \ , M} is 
called the dimension partitioning variable for the ra-th cluster. 
For convenience, define F = {Fi,...,Fjv} as the set of 
Tx subspace control variables for all user clusters. Both F 
and S n ,\/n are computed at a central node based on the 
global statistical information 0. The inner precoder G„ = 
[g k]k£u ^ C S ’" X I W "I is used to realize the intra-cluster spatial 
multiplexing gain using simple zero-forcing (ZF) precoding. 
Specifically, for a user k £ U n , define hfc jn = Fj,h t j £ C s " 
as its effective channel. Then, the ZF inner precoding vector 
gfc with ||gfc|| = 1 is obtained by the projection of the 
effective channel h k, n on the orthogonal complement of the 


subspace span 


V, 


fc'ew„\{fc} 


. Hence, G„ is computed 


locally at BS l n based on the local real-time instantaneous CSI 

H n = hfc n £ cl w "l xS ". Note that in the proposed 

L ’ J fcew„ 

two stage precoding, only the inner precoder is based on 
ZF techniques for analytical tractability. The Tx subspace 
control is not based on ZF techniques but based on the QoS 
optimization problem V in Section m 

The diversity gain of the users in the ra-th cluster increases 
with the dimension partitioning variable S n . On the other 
hand, the interference leakage to other user clusters also 
increases with As a result, there is a fundamental tradeoff 
between direct link diversity gain and cross link interference 
leakage. It is important to dynamically adapt the dimension 











partitioning variable S n based on the spatial channel corre¬ 
lation matrices © to optimize the overall performance. In 
general, the simple BD-based subspace precoder F„ may 
be far from optimal as will shown in the simulations. We 
shall formulate the design of the Tx subspace control F and 
subspace dimension partitioning S n ,\/n formally in Section 

m 


III. Optimization Formulation for Tx Subspace 
and Dimension Partitioning 


Assume that user k has perfect knowledge of the effective 
single-input single-output (SISO) channel h [. b F^.g/,. and the 
interference-plus-noise power. Then for given Tx subspace 
control F, subspace dimensions { S n }, and instantaneous CSI 
H, the SINR of user k is 


SINRfc 


Pk 


l f o.-- 

l k,b k *■ n k 




E 


ut 

neBfc n k,l 


F„G„P„GlFji.h 


k,l n 


( 2 ) 


where G„ is the ZF inner precoder given by a function of 
F n and H (or more precisely, the effective channels H n ). In 
this paper, we focus on designing Tx subspace control F to 
optimize the QoS of the users. Specifically, we consider the 
following QoS optimization problem: 


a simple admission control based on the solution of Problem 
V. For example, whenever a new user arrives, the admission 
control solves Problem V with all users (including the newly 
arrival user). If Problem V is feasible, this user is admissible. 
Otherwise, this user is not allowed to access the system. As a 
result, we can focus on the case when Problem V is feasible. 

Problem V is a very challenging problem. First, the prob¬ 
ability functions in the SINR chance constraint ([3]) do not 
have closed form expressions. Thus, one may only resort 
to approximation methods. Due to the two-stage precoding 
structure, the inner ZF precoder G ra is also a function of 
the Tx subspace control variable F„. As a result, the exist¬ 
ing approximation methods for the “standard form” chance 
constraints in (9|, |10j cannot be applied here. Second, the 
dimension partitioning variable S n also needs to be optimized, 
which is a difficult combinatorial problem. Third, the domain 
IjAf x S„ of F n is non-convex. To tackle the above challenges, 
we propose a novel bi-convex approximation approach, which 
consists of three key ingredients: random matrix theory, chance 
constrained optimization and semidefinite relaxation (SDR). 
Then we propose an efficient algorithm to find the optimal 
solution of the resulting bi-convex approximation problem. 

IV. The Bi-convex Approximation Approach 


V : max 7 

F,{S„},7>7° 

s.t. Pr {SINRfc > w kl } > 1 - e fe ,Vfc; (3) 

F„ eU MxS »,S„ £ {\U n \,...,M},Vn, (4) 

where the SINR chance constraint in ensures that the 
probability of SINR/, not satisfying a service requirement 
Wfc 7 is less than a maximum allowable outage probability 
£ (0,1), and w k > 0 is the QoS weight for user k. The 
constraint 7 > 7 0 is used to guarantee some minimum QoS 
for all users. The constraint S n > \U n \ is to ensure that the ZF 
inner precoding is feasible at each BS. The QoS constraint in 
(|3]i is especially useful for media streaming applications where 
there is a stringent delay requirement for the media packets 
requested by the users and thus it is desirable to maintain a 
fixed data rate (with high probability) for each user. The QoS 
weights w k s and the maximum allowable outage probabilities 
efc’s can be used to provide differential QoS for different users. 

Note that the optimization of the Tx subspace control 
variable F„ in Problem V includes the optimization of both 
the dimension partitioning variable S n (or equivalently, the 
dimension of F n ) and the value of F„. Both F„’s and S n ’s 
are adaptive to the spatial channel correlation matrices 0 , i.e., 
they are fixed and independent of the instantaneous CSI H for 
given ©. 

Remark 2 (Admission Control). Note that Problem V may not 
always be feasible, i.e., the minimum QoS requirement WkJ° 
cannot be guaranteed for all users even with infinite transmit 
power. In practice, an admission control can be used to ensure 
that there is enough resource in the system to guarantee 
the minimum QoS for all users. The admission control is a 
challenging problem and a systematic design of admission 
control is out of the scope of this paper. However, we can use 


Since it is difficult to solve Problem V directly, we propose 
to find a sub-optimal solution by solving a bi-convex approxi¬ 
mation of V. The proposed bi-convex approximation contains 
three steps, where each step solves one key technical challenge 
associated with Problem V. 

A. Interference Approximation Step 

To handle the SINR chance constraint in (|3j, we need to 
find a simple characterization of the distribution of SINR/,., 
which is very difficult because the interference term //. = 
EneB*. h l./ ri F « G " P « G n F « h fcJ„ in SINR fc depends on the 
instantaneous channel vectors h ; . t , n £ B k of all the cross 
links of user k. To simplify the characterization of SINR/., 
the interference approximation step is used to address the 
following challenge. 

Challenge 1 (Deterministic Approximation of //.). Find a 
deterministic approximation I k of I k such that I k does not 
depend on the instantaneous channel vectors h fc ,n £ B k 

of the cross links of user k. _ 

We resort to the random matrix theory to solve the above 
challenge. Specifically, we apply the random matrix theory 
to derive an asymptotic (and deterministic) upper bound I k 
of I k as M —t 00 and \U n \ —t 00 ,Vn. We then use 
It- as an approximation of I k for finite but large M and 
|Z4|’s. Throughout the paper, the notation M —>■ 00 refers to 
M —> 00 and \U n \ —> 00 , Vn such that 0 < liminf \U n \ IM < 

M—f 00 

lim sup \U n \ /M < 00 . The following assumptions are re- 

M—f 00 

quired in order to derive the asymptotic upper bound of I k . 

Assumption 2 (Technical Assumptions for Interference Ap¬ 
proximation). All spatial correlation matrices have 








uniformly bounded spectral norm on M, i.e., 


limsup sup || ©fc,z || < oo, ML (5) 

M—foo 1 <k<K 

Assumption [2] is satisfied by many MIMO channel model 
and it is a standard assumption in the literature, see e.g., G3- 

m- 

Under Assumption [2] we have the following theorem. 

Theorem 3 (Asymptotic Interference Upper Bound). Under 
Assumption [2] and a feasible Tx subspace control F (i.e., F 
satisfies the constraints in 0 and 0 for some 7 > 7 0 ), we 
have Ik ~ Ik < 0 as M —> 00 , where 

h = E PnTr (Ft 0 fc j B F n ) . 

neBfc 

and P n = YYk' eu E;' ^ le sum transmit power of the users 

in the n-th cluster. 


Please refer to Appendix [A] for the proof. 

Replacing the interference term Ik in SINR/, using the 
asymptotic interference upper bound Ik in Theorem pi we 
obtain an asymptotically safe approximation of the SINR 
chance constraint in ([3]0| 


Pr 


h 


t 

k,bk 


F rik gA; 



> 1-Cfc. 


( 6 ) 


B. SINR Chance Constraint Restriction Step 

The asymptotically safe approximate SINR chance con¬ 
straint in 0 remains intractable although it appears to be 
relatively easier to handle than the original counterparts in 
0. The restriction step aims to find a quadratic conservative 
approximation of 0- There are some existing methods that 
use the worst-case deterministic constraints to approximate 
various forms of chance constraints ®, (TU), (T9). For exam¬ 
ple, in the context of transmit beamforming design for MU- 
MIMO downlink with imperfect CSI 1101, the Bernstein-type 
inequality was used to construct a convex restriction of the 
SINR chance constraint that involves a quadratic function of 
the standard complex Gaussian vector. Unlike the conventional 
transmit beamforming problem, in our problem, the MIMO 
precoder is divided into Tx subspace control F n and inner 
precoder G„ which change at different time scales. As a result, 
the asymptotically safe approximate SINR chance constraint 
in ([ 6 ]) does not belong to any of the standard forms that have 
been studied in the existing works. For example, the inner 
precoding vector g/, : is also a function of the random channel 


vectors and thus 


-.t p 

k,b k n k 




is not a quadratic function 


of the standard complex Gaussian vector. In our solution, the 
restriction step entails finding a solution to the following: 


Challenge 2 (Asymptotically Quadratic Restriction of SINR 
Chance Constraint 0). Find a quadratic function f/,. (F) 
from F to R 2 such that if f/ (F) > 0, then the SINR chance 
constraint in (|3j> is asymptotically satisfied as M —> 00 


'Asymptotically safe approximation means that if is satisfied, then the 
SINR chance constraint in is satisfied as M —> 00 . 


Our strategy is as follows. First, we show that conditioned 


on H_*. = 


h fc',rj. fe J \{fc}’ 

2 

of user k can be expressed as a quadratic 
'unction of a standard complex Gaussian vector. Then we 
apply the standard method in |j9jj, JTO] to construct a quadratic 
restriction of the chance constraint in 0 conditioned on a 
given H k . Finally, we use the quadratic restriction of the 
conditional chance constraint of 0 with the “worst case” H_*. 
as a quadratic restriction of the unconditional chance constraint 
in ( 61 . Since 0 is an asymptotically safe approximation 
of (3 1 , the result will then be an asymptotically quadratic 
restriction of the SINR chance constraint in 0. The final 
result is summarized in the following theorem. Please refer 
to Appendix [B] for the detailed proof. 

Theorem 4 (Solution to Challenge0. The asymptotically safe 
approximate SINR chance constraint in 0 is satisfied if 


the effective channel gain 


p 

l k,b k n k 


g k 




<?k 




Tr 


(®s 


F F 


n k n k 


M e ; 

) ^ a k + l^s fc | ^ 1) 


> 7 E„ e g fc Tr (®fc,nF„Ft) + c fc ( 7 ). 


(7) 


where S k = In a k = v/2 ^+^ 2l5fc + -, 
is the largest eigenvalue of ©fc,n = 

«W=I=ffr + ( 1 - i f I ) (1^.1 -!)• 



WkPn 

**k P k 
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k,l n 


and 


be in 


In (71, Tr 


■ (&- F F f ^ 

\ yJ n k F n k E n k J 


can 


Tr (© fc , n F„Ft; 

as the signal power of the direct link and 


and E n& B k 


erpreted 

interference leakage from the cross links. Hence, the quadratic 
restriction of 0 in 0 captures the key tradeoff between direct 
link signal power and cross link interference leakage. 


Remark 5. The asymptotically quadratic restriction in 0 is 
also valid under finite M. The asymptotic approach based 
on random matrix theory has been widely used in wireless 
communications especially when it is difficult to directly 
analyze the original system and the validity of such approach 
has been verified by numerous works. In this paper, we utilize 
random matrix theory and obtain 0 as a safe approximation 
of the original SINR chance constraint when M —► 00 . It has 
been shown in | 20 ;| that the random matrix theory results are 
pretty good approximation even when M is small. Hence, 0 
is a safe approximation of the original SINR chance constraint 
even under finite M in massive MIMO (as illustrated in Fig. 
[3] in Section [VT|i. 

According to Theorem [3] and Theorem [4] 0 is a quadratic 
restriction of the original SINR chance constraint in 0 for 
large M. Replacing the SINR chance constraint 0 in V 
with the quadratic constraint in 0 , we have the following 
asymptotically conservative formulation of Problem V: 


V\ : max 7 

F.{S„}, 7 >7° 

S.t. 0 is satisfied for all k, 

F n sU MxS »,5„ € {\U n \ ,...,M},Vn. ( 8 ) 






















C. Semidefinite Relaxation Step 

By combining the techniques in random matrix theory 
and chance constraint optimization, we have constructed an 
asymptotically quadratic restriction for the complicated SINR 
chance constraint in Q. However, we still need to solve the 
challenge associated with the combinatorial optimization of 
the dimension partitioning variable S n and the semi-unitary 
constraint on F„. 


Problem V is a bi-convex problem, i.e., it is convex w.r.t. 
W ( 7 ) for fixed 7 (W), but is not jointly convex. Note that 
Problem V does not involve the combinatorial optimization 
w.r.t. the dimension partitioning variable S n because the 
optimal S n is automatically determined by the rank of the 
optimal W„. To completely solve Challenge [3] we further 
prove the equivalence between V and V\ in the following 
theorem. 


Challenge 3 (Dimension Partitioning and Semi-unitary Con¬ 
straint). Find a bi-convex problem which is equivalent 
to Problem V\ and does not involve the combinatorial 
optimization w.r.t. the dimension partitioning variable S n . 


In the following, we apply SDR to solve the above chal¬ 
lenge. It is easy to see that Problem V\ is equivalent to the 
following problem 

V 2 '■ max 7 

W,7>7° 


S.t. 


> 7 Enes k Tr (0fc,„W n ) + c k ( 7 ) ,Vfc; 


(9) 


W n = W^Tr(0„W„ 


(«X) 


> Vn, w n 7 0,Vra, (10) 


where W = {Wi,...,Wjv} with W„ = F n F^ e 
C MxM ,n = 1 and rj n = max te » n + \U n \ - l). 

The constraint in © is derived from the constraint <[ 8 ]> in 
V\ and the second constraint in (JT) using the fact that W„ 
can be expressed as W„ = F„f| with F ri £ U MxSr * 
and only if W n = W;(,Tr(W„) = S n ,W n 7 0 (Note that 
Tr ^0° W«) — Vn implies that Tr (W n ) > \U n \). For any 
feasible solution W of V 2 , we can calculate the corresponding 
feasible solution F = {Fi,...,Fjv} and S n ,fn of V\ using 
eigenvalue decomposition (EVD) W„ = F n Ig n FT,Vn and 
S n = Tr(W„) ,Vn respectively. 

Problem V 2 is still difficult to solve due to the non- 


convex constraint W„ = W? in ( [ 10 ] ) and the bi-linear term 
Tr f@fc in W„) in (|9|. To make the problem tractable. 


neBk 


7E 

we apply SDR to obtain a convex relaxation of the constraint 
W„ = Wfj as W„ — Wj; 7 0, which is further equivalent 
to the following convex constraint 


I M 

w„ 


W„ 

w„ 


A 0 . 


( 11 ) 


Then by replacing the constraint W„ = with the relaxed 
constraint in ( [TT] ) and removing the constraint 7 > 7 0 , we 
obtain a relaxed problem of V 2 as 

V : max 7 

W,7 

afcTr > 

7EnGt3 fc Tr (®fc,nW„) + Cfc (j) ,Vfc (12) 
Tr(©°W „)>Th,Vn (13) 

7 0, W„ y 0, Vn (14) 


s.t. 


IM 

w n 


W n 

w„ 


where a k = ^1 — 


V'2<5fc 

(7k 


Theorem 6 (Equivalence between V and V\). The optimal 
solution W*, 7 * ofV satisfies W*—W * 2 = 0 , Vn. Moreover, 
ifV i is feasible, then 7 * > 7 0 and thus W*, 7 * is also optimal 
for Vi. 

Please refer to Appendix [C] for the proof. 


D. Optimal Solution of Problem V 


In this section, we propose an iterative algorithm named 
Algorithm BCA to solve V by combining the convex opti¬ 
mization method (for the optimization of W) and the bisection 
search method (for the optimization of 7 ). We first propose a 
low complexity algorithm to solve V with fixed 7 based on 
the Lagrange dual method. Then we summarize the overall 
Algorithm BCA and establish its global convergence. 

1) Lagrange dual method for solving Problem V with fixed 
7 : For fixed 7 , problem V is convex and thus can be solved 
by the standard convex optimization method |2T). However, 
the computation complexity of standard convex optimization 
method is still quite high as the number of antennas M 
becomes large. Based on the Lagrange dual method, we 
exploit the specific structure of the problem to propose a 
low complexity algorithm which can significantly reduce the 
computation complexity over the standard convex optimization 
method. 

The Lagrange function of Problem V with fixed 7 is 

N 

L (ft, v, W) = 7 + 5>"(Tr(®^W n ) -Vn) + 

n= 1 


K r 

X Pk afcTr (e 


fc= 1 


0 Bi w St 


7 XI Tr (®fe,« W «) - Cfc, 1 - Cfc, 27 ) 

nGB k 


where p = \pk ] k=1 , 


,K 


is the Lagrange multiplier vec¬ 


tor associated with the constraint in ( 121 , v = [i/„] n _i N € 
™' is the Lagrange multiplier vector associated with the 




constraint in (113]), Cfc,i = a k ( \Un k \ — 1) and c k , 2 = 


Wk 


TV 


Note that L(n, i/,W) is a partial Lagrangian since < 14 1 is 
kept as a separate constraint. The dual function of Problem V 
with fixed 7 is 


J (/it, u) = max L (/x, 17 W), s.t. (141 is satisfied, 
w 

The corresponding dual problem is 


(15) 


min J (/x, u ), s.t. p > 0, v > 0. (16) 

11,V 

Note that L (/x, 17 W) = E^Li Tr (A„W„) + c, where 

A n = EfcGW„ Pkttk®n k — 7 E k eu n Pk®k,n + t' n & n , lA n = 
















[k : fLk ^ n, ( k,l n ) £ £}, and c is independent of W. Then 
the maximization problem in (15 i can be decomposed into N 
independent problems as 


max Tr (A n W n ), s.t. W„ - W„ y 0,W„ y 0, 

W„ 


(17) 


for n = 1, TV. Let A n = U n D„IJl be the EVD of A ra and 
let U* denote the eigenvectors corresponding to the positive 
eigenvalues of A n . Then for fixed fi. is, it can be shown that 
Problem (17) has a closed form solution given by 

w;(/*,i/) = u;u;t. (is) 


Since Problem V with fixed 7 is convex, the optimal solution 
is given by W*(/x*,iC) = {W* (/x*, is*) : Vn}, where 
(fi*, is*) is the optimal solution of the equivalent dual problem 
in | [T 6 | ). 

The dual function J (/x, is) is convex and V./ ( 7 /. is) = 
[d Ml J (/x, is),..., dfj, K J (/x, is), d Vl J (/x, is),..., d„ N J (n, is)] T 
is a subgradient of J (/x, is) at (/x, is), where 

[li,is) = OfcTr (/x, is)^j — CkXi 

-7 ^ Tr(0 fci „W:(/x,ix)) -c feil ,Vfc 

neB fc 

d„ n J {n, is) = Tr ^0° W* (fi,is)J — r) n ,Vn (19) 


Hence, the standard subgradient based methods such as the 
subgradient algorithm in |22) or the Ellipsoid method in [211 


can be used to solve the optimal solution (/x*, is*) of the dual 
problem in (fl 6 |). 


Remark 7. For Problem V with fixed 7 , the Lagrange dual 
method is a better choice than other standard convex opti¬ 
mization methods such as interior point method for two 
reasons: 1 ) the number of primal variables in this problem is 
much larger than the number of dual variables ( NM 2 + 1 
versus K + N with M K and M TV); 2) for 
fixed Lagrange multipliers /x, is, the optimal primal variables 
W* (/x, is) have closed form solution. Hence, although the 
convergence speed of the simple subgradient and Ellipsoid 
algorithms for the optimization of dual variables is not as fast 
as the more complicated algorithms, the overall computation 
time (to reach a given accuracy) of the Lagrange dual method 
is much smaller than other standard convex optimization 
methods. 


2) Summary of the Overall Algorithm for solving Problem 
V : Algorithm BCA is summarized in Table [n] and the global 
convergence is established in the following theorem. 


Theorem 8 (Global Convergence of Algorithm BCA). For 
fixed tolerable error e, Algorithm BCA terminates after I e = 
log 2 iterations. Let W e = W( /e ), 7 £ = 7 HH 

denote the solution found by Algorithm BCA with tolerable 
error e. Then we have ( 7 s — 7 *) < e. Moreover, as e 0, we 
have W e -7 W* and 7 ® -7 7 *. 


Sketch of Proof: Algorithm BCA ensures that 7 * £ 

[ 7 min, 7 max] a fter each iteration. Moreover, after the i-th itera- 

l—O _ O I , .N 

tion, we have | 7 max - 7 min | < -_ 2 i ' and 7 W £ [ 7 min j Tmax] • 

Hence, we have I 7 W — 7 *| < ^ L from which Theorem 
[ 8 ] follows immediately. ■ 


Table II: Algorithm BCA (for solving Problem V) 


Initialization: Choose 7 m ; n = 7" and 7 rmx = 7" > 7“ such 
that Problem V with fixed 7 = 7 0 is infeasible. Let i = 1. 
Step 1: Let 7 ^ = 7min + 7m " . Solve Problem V with 
fixed 7 = us ing the Lagrange dual method 


in Section IV-D1 
Step 2: If a feasible solution, denoted by is found 

in Step 1, let 7 m i„ = 7^7 otherwise, let 7 max = 7 ( ^. 
Step 3: If | 7 max — 7mi n | < e, where e > 0 is 

the tolerable error, terminate the algorithm. 
Otherwise, let i = i + 1 and return to Step 1. 


Asymptotically quadratic restriction of the 
SINR chance constraint 



Figure 2: Illustration of the relationship between different problems. 


E. Summary of the Overall Solution 

The relationship between problems V, V \, V > and V is 
summarized in Fig. [2] and is elaborated below. 

• Relationship between V and V\ : Problem V\ is ob¬ 
tained by replacing the original SINR chance constraint 
in ([3]) with its asymptotically quadratic restriction in (|7]i. 
Hence, Problem V\ is an asymptotically conservative 
formulation of the Problem V, i.e., the solution of V\ is a 
feasible solution to the original problem V for sufficiently 
large M. 

• Relationship between V and V: By applying the SDR 
technique, we obtain Problem V as a relaxation of 'P\. 
By Theorem [b] such relaxation is tight and thus V is 
equivalent to V\. Hence, V is also an asymptotically 
conservative formulation of the Problem V. 

Then we summarize the overall solution. First, Algo¬ 
rithm BCA is performed to calculate the optimal solution 
W*, 7 * (up to some tolerable error e) for Problem V. Then 
we calculate the corresponding Tx subspace control F* = 
{F*,.... F^} using EVD W* = F*I s *F*t, Vn, where S* = 
Tr(W*). Finally, we output F*,^*} , 7 * as an approximate 
solution to the original Problem V. Note that F*. {.S'*} , 7 * 
is usually a conservative solution in the sense that the actual 
outage probability of each user k under F*, {5 1 *} , 7* is much 
lower than the maximum allowable outage probability r k . 

Remark 9. In this paper, we focus on the optimization of 
Tx subspace control F and subspace dimensions { S n } for 
fixed power control {Pk} (problem V), which is the key of 
the two stage precoding design for massive MIMO systems. 
The results and algorithm in this paper also provides a basis 


























for solving the more complicated joint optimization of Tx 
subspace control and power control { P k }. For example, we 
may design an alternating optimization (AO) algorithm to 
solve the joint optimization problem. In each iteration of the 
AO algorithm. Algorithm BCA is used as a component to 
find the optimal Tx subspace control for fixed power control, 
and then the optimal power control is obtained for fixed Tx 
subspace control. 


F. Methods to Improve the Performance Over the Consen’ative 
Solution 


One common drawback of using the worst-case determinis¬ 
tic constraint to approximate chance constraint is that, the ob¬ 
tained solution is usually too conservative 03- In the context 
of the probabilistic SINR constrained beamforming problem, 
the bisection refinement method has been proposed to mitigate 
the conservatism due to the deterministic restriction of the 
probabilistic SINR constraint G3- This bisection refinement 
method can also be applied in our problem to improve the 
performance of the conservative solution obtained in Section 
|IV-E| However, the bisection refinement method in JTO] re¬ 
quires solving Problem V for multiple times, which increases 
the computation complexity. To address this problem, we 
propose a simple non-iterative refinement method to determine 
the proper value of 5 k - First, we calculate the simple BD-based 
subspace control F BD using baseline 3 in Section VI Then, we 
let 7 BD = <I BD (efc) /wk,fk, where >I> BD is the empirical CDF 
of the SINR of user k, denoted by SINR BD , under BD-based 
subspace control F BD , and <b BD is the inverse function of <I> BD . 
Note that Pr {SINR^° > w k 7 BD } « l-e k . Finally, 5 k is cho¬ 
sen as the largest number that satisfies ([ 7 ]) with F = F BD and 
7 = 7 BD . Simulations show that the non-iterative refinement 
method achieves almost the same performance as the much 
more complicated bisection refinement method. 


V. Implementation Considerations 
In this section, we discuss various implementation issues, 
such as how to obtain the statistical and real-time CSI, the 
signaling overhead and the impact of CSI error. In the follow¬ 
ing analysis, we assume that the spatial channel correlation 
matrices © change every T time slots and the rank of each 
spatial correlation matrix is R. 


A. Slow Statistical CSI Acquisition 

Under typical scenario, the spatial channel correlation ma¬ 
trices change after a few seconds and the time slot duration is 
in the order of millisecond, i.e., T is usually in the order of 
a few thousands. For example, in urban areas with a carrier 
frequency of 2GHz and a user speed of 3km/h, the spatial 
channel correlation matrices remain unchanged for about 22 
seconds according to the measurement results reported in 
||23||. Hence, the statistical CSI acquisition can be done at 
a much slower timescale compared to the real-time CSI 
acquisition. Specifically, we propose a rank deficient oracle 
approximating shrinkage (OAS) estimator which is able to 
obtain an accurate estimation of the spatial correlation matrix 
using only T p = O ( R ) -C M independent channel samples. 


Rank deficient OAS estimator: 

Input: T p independent channel samples h (j) £ C M . j = 
1 ,...,T p of a random vector h £ C M with covariance matrix 

0 . 

Step 1: Using Gram-Schmidt process to obtain a semi 
unitary matrix U £ C M x R whose columns form an orthogonal 
basis for h (7 ) £ C M ,j = 1, ...,T p . 

Step 2: Form T p independent channel samples with reduced 
dimension h (j) = U'h (j) £ C R : j = 1 ,...,T p . Note that 
h (j)’s can be viewed as realizations of a random vector h £ 
C R with covariance matrix U^0U. 

Step 3: Use the OAS estimator in [ [24] with independent 
samples h (j) , j = 1,..., T p to obtain an estimation 0 of the 
covariance matrix of h . 

Step 4: Output the estimation of the covariance matrix of 
h as: © = U©'Ut. 

Then the statistical CSI acquisition is summarized as fol¬ 
lows. Within each T time slots, choose T p time slots such 
that the gap between any two selected time slots is larger than 
channel coherent time. Then each BS transmits M dedicated 
pilot symbols in each of the selected T p time slots for the 
users to estimate the spatial correlation matrices using the rank 
deficient OAS estimator. Finally, user k sends the non-zero 
eigenvalues and the corresponding eigenvectors of & kt i,fl to 
the associated BS. 


B. Real-time CSI Acquisition 

The dimension of the effective channel vector of a user 
in the n-th user cluster is equal to S n ( the rank of the n- 
th Tx subspace control variable F n ), which is O (1) and is 
much less than M. To estimate the effective CSI H„ of the 
n-th user cluster, the pilot symbols need to be precoded using 
the n-th Tx subspace control variable F n since the effective 
CSI H„ is restricted in the subspace spanned by F n . As a 
result, we can simultaneously transmit N pilot symbols at 
a time for N user clusters because the Tx subspace control 
F can also mitigate the inter-cluster interference in the pilot 
transmission stage. The total number of orthogonal pilot sym¬ 
bols required for effective CSI estimation H„,n = 1 
is max ra S n = 0(1). Hence, by treating F„ as part of the 
effective channel, the estimation of the effective CSI H n is 
similar to the conventional channel estimation problem in a 
single-cell small-scale MIMO system and the conventional 
CSI signaling method in LTE can be used to estimate the 
effective CSI. 


C. Statistical and Real-time CSI Signaling Overhead 

The CSI signaling overhead includes the pilot symbol 
overhead (the average number of orthogonal pilot symbols for 
real-time and statistical CSI estimation) and the uplink CSI 
feedback overhead (the average number of feedback vectors 
with different dimensions). For simplicity, assume that each 
BS is associated with Kq users, each user is interfered by 
L 0 — 1 neighbor BSs, and S n = S, "in. Then the real¬ 
time CSI signaling overhead of the proposed scheme is “S 
PS, Kq C s ” per time slot per cell, which means that, the 





proposed scheme requires transmitting S independent pilot 
symbols and feed backing Kq complex channel vectors of 
dimension S per time slot per cell. For each spatial correlation 
matrix, the corresponding user only needs to send R non¬ 
zero eigenvalues and the corresponding eigenvectors to the BS. 
Hence, the statistical CSI signaling overhead of the proposed 
scheme is PS, C M , per time slot 

per cell, which means that, the proposed scheme requires 
transmitting T '' independent pilot symbols, feed backing 
k °l° r complex vectors of dimension M (eigenvectors of 
spatial correlation matrices), and feed backing KiRji ' real 
vectors of dimension R (eigenvalues of spatial correlation 
matrices) per time slot per cell. As a comparison, the CSI 
signaling overhead of the conventional single-cell MU-MIMO 
precoding scheme is “M PS, Kq C m ” per time slot per cell. 
Since T^T p = 0(R) = O (L 0 ) = O (1), the overall CSI 
signaling overhead of the proposed scheme is significantly 
smaller as will be shown in the simulations. 

D. Backhaul Signaling Overhead 

Suppose there are a total number of N fj links (including 
both direct links and cross links) in the network. The BSs 
need to send R non-zero eigenvalues and the corresponding 
eigenvectors of every spatial correlation matrix to the central 
node within each T time slots. Hence, the backhaul signaling 
overhead is C M , ^ R” per time slot (i.e„ ^ 

complex vectors of dimension M and ' /l ‘ real numbers per 
time slot). Since T R is usually true for massive MIMO 
systems, the backhaul signaling overhead can be greatly 
reduced compared to the cooperative MIMO or centralized 
coordinated MIMO where the BSs need to send N k complex 
channel vectors of dimension M to the central node at each 
time slot. 

E. Summary of Practical Real-time Implementation Aspects 

In the proposed two stage precoding technique, there are 
four components (steps) working at different timescales. 

Step 1 (Slow Statistical CSI Acquisition): First, the BSs 
need to obtain the spatial channel correlation matrices using 
the method in Section [V-A| and then send them to the central 
node. 

Step 2 (Tx Subspace Control Optimization): With the 
obtained statistical CSI from the BSs, the central node calcu¬ 
lates the optimal Tx subspace control variables {F „} using 
Algorithm BCA and sends F„ to BS l n (which is the BS 
associated with the n-th user cluster). 

Step 3 (Real-time CSI Acquisition): By treating F n as 
part of the effective channel, the estimation of the real¬ 
time effective CSI H„ is similar to the conventional channel 
estimation problem in a single-cell small-scale MIMO system. 

Step 4 (Inner Precoding): At each time slot, BS 
performs inner precoding using the obtained effective CSI H„ . 

Both the statistical CSI acquisition and the Tx subspace 
control optimization can be done at a much slower timescale 
compared to the time slot duration. Hence, the proposed 
solution is not sensitive to the backhaul signaling latency 
delay and computational delay caused by the Tx subspace 


control optimization. On the other hand, the effective CSI 
acquisition and inner precoding per cluster is similar to that 
in the single-cell small-scale MIMO system and thus they can 
be implemented in real-time at each time slot. Note that in 
practice, it is impossible to obtain perfect effective CSI at each 
BS due to the channel estimation/feedback error. The impact 
of imperfect effective CSI on the performance is similar to 
that in the conventional small-scale MIMO systems with ZF 
precoding. As a result, we can also employ MMSE techniques 
for the inner precoder to mitigate the effects of CSI errors. The 
theoretical analysis of the effect of imperfect effective CSI on 
the two stage subspace constrained precoding is left as an 
interesting future work. 

Remark 10 (Extension to Frequency Selective Fading Chan¬ 
nels). For clarity, we assume flat fading channel in this paper. 
In the wideband systems (such as OFDM) with frequency 
selective channels, the spatial correlation matrices are approxi¬ 
mately identical on different subcarriers |25| , (26] |. As a result, 
the same Tx subspace control F can be applied to different 
subcarriers and the results and algorithm can also be extended 
to the wideband systems. The extension to wideband systems 
depends on the specific QoS requirements. If we require that 
the SINR of each user on each subcarrier must be larger than 
a threshold with high probability, then the proposed solution 
can be directly applied. However, if we require that the rate 
of each user (rate summed over all subcarriers) must be larger 
than a threshold with high probability, then the extension may 
be non-trivial. 


VI. Simulation Results 


A. Simulation Setup 

Consider a massive MIMO cellular system with 19 cells, 
where a reference cell is surrounded by two tiers of (inter¬ 
fering) cells. The inter-site distance is 500m. In each cell, 
there are K f) users and 2 uniformly distributed hotspots 
(user clusters). Each hotspot contains K c users and the other 
K 0 -2K C users are uniformly distributed within the cell. Each 
BS is equipped with M antennas. The path gains T/. /’s are first 
generated using the path loss model (“Urban Macro NLOS” 
model) in [ [27] . Then the spatial correlation matrices &k /s 
are generated according to the local scattering model in (6j, 
0 - Specifically, the correlation between the i, m-th channel 
coefficients of h k i is given by 

^ max 

[®k,l] im = f V’k,/ (p) eJ ™‘ ^dip, ( 20 ) 

’ /, ~min 

J fk.i 


where p denotes the AoD, ip k ,i (p) denotes the power gain 
for the path specified by the AoD p and (p) is the 

phase difference between the i-th and m-th BS antennas on 
the path specified by p. In the simulations, (p) in (201 
is generated using a uniform linear array with the antenna 
spacing equal to half wavelength, and f>k t i ( p ) = S 

Pk'iiPkT ' where tp kl is chosen such that Tr = 

ML k ,i and are randomly generated such that 1) 




= ? , 2) if user k and user k belong to the 













same hotspot, we have p'f'f = We assume 

the spatial channel correlation matrices © change every 1000 
time slots. 

In the basic simulation setup , the per BS transmit power is 
Pb = lOdB and the other system parameters are set as Kq = 7, 
K c = 3 , M = 40, = 0.05, Vfc and Wk = Lk,b k ,Vk; the 

transmit powers of the users are given by Pk = j^,\/k. In 
the simulations, we will change the system parameters in the 
basic setup to study the impact of different system parameters 
on the performance. We only evaluate the performance of the 
reference cell to avoid “unsymmetric edge effects”. 

We compare the performance of the proposed solution with 
the following baselines. 

• Baseline 1 (FFR): Fractional frequency reuse (FFR) |28| 
is applied to suppress the inter-cell interference. There are 
6 subbands, 3 of which are used for the inner zone and 3 
are used for the outer zone. The radius for the inner zone 
is set as 150m. In each cell, ZF beamforming is used to 
serve the users on each subband. 

• Baseline 2 (Clustered CoMP): 3 neighbor BSs form a 
BS cluster and employ cooperative ZF |5) to simultane¬ 
ously serve all the users within the BS cluster. 

. Baseline 3 (JSDM-PGP): This is the two-stage precod¬ 
ing scheme proposed in (8]|, where the per-group process¬ 
ing (PGP) with approximate BD is employed at each BS. 
In the approximate BD, the pre-beamforming matrices F 
are chosen to satisfy F^U* j = 0, Vn G Bk,Vk, where 
j £ U Mxr k,i i s a semi-unitary matrix collecting the 
dominant eigenvectors of &k,i- The number r* k ( of dom¬ 
inate eigenvectors is chosen such that ; < 

—20dB < (r k where (m) is the m-th largest 
eigenvalue of &k,i in dB. Simulations show that such 
choice of r k ; can achieve a good performance. Both 
ZF inner precoder and regularized zero-forcing (RZF) |2j 
inner precoder will be simulated. 


B. Verify the SINR Satisfaction Probability 

First, we verify that the original QoS requirement in Q is 
satisfied by the proposed solution under finite M. Note that 
the QoS requirement in (3]i means that the minimum SINR sat¬ 
isfaction probability minfcPrjSINRfc > 7} (conditioned on a 
given spatial correlation matrices 0) is larger than some target 
value 1 — £fc. Fig. [3] shows the histograms of the minimum 
SINR satisfaction probabilities over different realizations of 
© under the basic simulation setup (M = 40). To obtain 
the histograms, we generated 100 realizations of ©. Then, 
for each realization of ©, the minimum SINR satisfaction 
probability is numerically evaluated using 10000 randomly 
generated realizations of instantaneous CSI H. Fig. ^validates 
that the proposed solution indeed achieve a minimum SINR 
satisfaction probability no less than the required target 95%. 
The “probability gap” in Fig. [3] refers to the difference 
between the mean minimum SINR satisfaction probability 
and the target SINR satisfaction probability. Without using 
the refinement methods, the proposed solution is conservative, 
e.g., the “probability gap” is 0.042. However, with the simple 
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Figure 3: Histograms of the minimum SINR satisfaction probabili¬ 
ties under the basic simulation setup. The red dash line indicates the 
target SINR satisfaction probability and the green dash line indicates 
the mean minimum SINR satisfaction probability. 


non-iterative refinement method, the proposed solution become 
“tighter” (i.e., the “probability gap” is smaller), and with the 
more complicated bisection refinement method in (TO), the 
proposed solution can be even tighter. 

C. Comparison of Outage Throughput Performance under 
Different System Parameters 

In Fig. [4] we plot the outage throughput of different schemes 
under the basic simulation setup, where the outage throughput 
of user k is defined as the maximum data rate that can 
be achieved with a probability no less than 1 — <7. For 
baseline 2, the 3 cooperative BSs need to exchange CSI and 
payload data, and thus there is CSI delay when the backhaul 
latency is not zero. When there is CSI delay, the outdated 
CSI is related to the actual CSI by the autoregressive model 
in (29) with the following parameters: the user speed is 3 
km/h; the carrier frequency is 2GHz. For baseline 3, ZF inner 
precoder is simulated. It can be seen that the performance of 
the proposed scheme with the simple non-iterative refinement 
method is close to that with the more complicated bisection 
refinement method in m- The outage throughput of the 
proposed scheme is larger than baseline 1 and baseline 3. 
Although the performance of baseline 2 is promising at zero 
backhaul latency, the performance quickly degrades at 10ms 
backhaul latency. 

In Fig. 0 we change the number of antennas to M = 64. 
The other simulation setup is the same as the basic simulation 
setup. Compared to Fig. [4] with M = 40 antennas, the 
performance of all schemes improves as the number of anten¬ 
nas increases. However, the performance improvement of the 
proposed scheme and baseline 3 is larger. This is not surprising 
since the proposed scheme and baseline 3 are specifically 
designed for massive MIMO systems with spatially correlated 
channel. 

In Fig. [6] we change the number of users in each cell to 
I\q = 8 and the number of users per cluster to K c = 4. 
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Figure 4: Outage throughput comparisons of different schemes under 
the basic simulation setup. 


Figure 7: Outage throughput comparisons of different schemes with 
heterogeneous QoS requirements. The other simulation setup is the 
same as the basic simulation setup. 



3 becomes larger. This shows that the proposed solution can 
better handle the heterogeneous QoS requirements. 

Under all simulation setups, the proposed scheme outper¬ 
forms baseline 1 and baseline 3, as well as baseline 2 under the 
practical scenario with 10ms backhaul latency. These results 
demonstrate the superior performance and the robustness of 
the proposed optimization based subspace constrained precod¬ 
ing w.r.t. signaling latency in backhaul. 


Figure 5: Outage throughput comparisons of different schemes. The 
number of antennas is M = 64 and the other simulation setup is the 
same as the basic simulation setup. 


The other simulation setup is the same as the basic simulation 
setup. In this case, the users in each cell concentrate in the 
two hotspots. Compared to Fig. [4} the performance of both the 
proposed scheme and baseline 3 are improved. This shows that 
the proposed scheme and baseline 3 favor the scenario when 
the users in each cell concentrate in a few user clusters. 

In Fig. [7] we consider heterogeneous QoS requirements, 
where cr is chosen uniformly between [0.05,0.15]. The other 
simulation setup is the same as the basic simulation setup. 
Compared to Fig. [4] with homogeneous QoS requirement, the 
performance gain of the proposed scheme over baseline 1 and 



D. Impact of Inner Percoders and CSI Errors 

In this paper, we consider ZF inner precoding because 1) the 
ZF inner precoding is more tractable; and 2) it has been shown 
in |T| that ZF precoding is close to optimal for massive MIMO 
systems with perfect CSIT. To study the impact of different 
inner precoders, we compare the performance of the ZF inner 
precoding with the RZF inner precoding under different outer 
precoders F, where the outer precoder refers to either the 
proposed optimization based Tx subspace control matrix or the 
pre-beamforming matrix in baseline 3. In RZF inner precoder, 
the regularization factor is fixed to as in |8|, where So is 
the total dimension of the outer precoders associated with the 
user clusters in the reference cell. To study the impact of CSI 
errors, the CSIT at each BS is modeled as follows: 

hfc,z = h — ek,i, 

where the CSI error ek,i is assumed to be a complex Gaussian 
vector with zero mean and covariance matrix afl, and al 
denotes the error variance. In Fig. [8] we plot the outage 
throughput of different schemes versus the error variance 
under the basic simulation setup. It can be seen that the 
performance of ZF inner precoding under the same outer 
precoder is similar to that of RZF inner precoding. The 
performance of all schemes degrades at a similar rate as the 
CSI error increases. In all cases, the proposed optimization 
based outer precoder outperforms the BD-based out percoder 
in baseline 3. 


Figure 6: Outage throughput comparisons of different schemes. The 
number of users in each cell is Ko = 8 and the number of users 
per cluster is K c = 4. The other simulation setup is the same as the 
basic simulation setup. 


E. Implementation Cost 

Table [HI] compares the implementation cost for different 
schemes in terms of the computational complexity (CPU time), 
the (real-time and statistical) CSI signaling overhead, and 
































































Appendix 

A. Proof of Theorem [j] 

Note that I k = ^j n 0iS,F n G n P n GtFt0^z fcjB 

where z u = —fr.?-,.. i £ C M has i.i.d. complex entries 
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Figure 8: An illustration of the impact of inner precoding and CSI 
errors under the basic simulation setup. The abbreviation “OP” stands 
for outer precoding and “IP” stands for inner precoding. 
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where EL = 
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CPU 

time 

Backhaul 
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Real-time 

CSI 

signaling 

Statistical 

CSI 

signaling 

Proposed 

0.13 s 

0.13 C 4U 

9 PS. 7 C 9 

0.8 PS, 0.12 C 40 

FFR 

0.06 s 

0 

40 PS, 7 C 40 

0 

Ideal CoMP 

1.29 s 

16 C 40 

40 PS, 7 C 120 
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JSDM-PGP 

0.04 s 

0.13 C 40 

9 PS, 7 C 9 

0.8 PS, 0.12 C 40 


—H Ht 
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Table III: Comparison of the per time slot MATLAB computational 
time and per time slot per cell signaling overhead of different 
schemes under the basic simulation setup. For the proposed scheme 
and Baseline 1, the statistical CSI © is obtained using the method 
described in Section |V-A| with T v = 20. The meaning of the CSI 


where A r ,.. max and A„ lnln are respectively the largest and the 
smallest eigenvalue of jyH n HT. It follows from Assumption 

0 that limsup || jjHnHjjll < oo with probability 1 |15|, 

M—>oo 

£ Cl^l xM . Since A n , max = 

max ' s uniformly bounded 

with probability 1. On the other hand, there exists Ao > 0 
such that, we have A„ im i n > Ao uniformly on M; otherwise, 
it will contradict with the assumption that F is feasible. It 
follows from the above analysis that ||G rl P n Gj t || is uniformly 
bounded on M with probability 1. Together with Assumption 
El 11 ©^y 2 FnGnPnG^F^Qy 2 I is also uniformly bounded 
on M with probability 1. Then by G3 Lemma 4], we have 
Ik — Ik 0, where 
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Tr(©^F„G„P„GlFt©^) 


signaling overhead notation in the table is also explained in Section 

V-A 

< ]T Tr(G„P„Gj l )Tr( 




= h 


from which Theorem [3] follows immediately. 


the backhaul signaling overhead caused by the exchange of 
(real-time or statistical) CSI between the central node and 
BSs. It can be seen that the overall implementation cost of 
the proposed scheme and baseline 3 is the lowest among all 
schemes. 


VII. Conclusion 

We propose a two-stage subspace constrained precoding 
scheme for massive MIMO cellular systems. The MIMO 
precoder is partitioned into inner precoder (for intra-cluster 
spatial multiplexing gain) and Tx subspace control matrix 
(for inter-cluster interference control). We formulate the Tx 
subspace control as a Quality of Service (QoS) optimization 
problem and propose a novel bi-convex approximation ap¬ 
proach to produce efficiently computable bi-convex approxi¬ 
mation of this QoS optimization problem. Analysis shows that 
the proposed solution is robust to backhaul signaling latency 
and can significantly reduce the CSI signaling overhead. 
Moreover, simulations verify that the proposed design achieves 
significant performance gain compared with various state-of- 
the-art baselines under various backhaul signaling latency. 


B. Proof of Theorem [7] 
Let = ht. 
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and consider the singular value decomposition (SVD): 
H_ fc = UDVt, where U £ u(l^l -1 ) x (l w "fcl -1 ), D £ 
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Gaussian vector. Applying the Bernstein-type inequality based 
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where Q = -h— Q. It is easy to see that all the eigenvalues 

of Q must be no more than 1. As a result, we have l|QL< 
Tr (Q) and thus (22 1 can be satisfied if 

Tr(Q) - Q) > ( 
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It can be shown that Tr (Q) > V "f , A,, where A, is the 
j-th largest eigenvalue of F^ ©° F ()fc . Using the fact that 
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Note that (|7j» is obtained from (p4| by replacing Tr (Q ) w ith 


Tr © 
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1. Hence 


.14 


ensures that (24) is 


satisfied. Clearly, the above relationsh irTne tween ( 2l~|24| ~ju)d 
I 7 J holds for any H As a result, (|7ji also ensures that the 
unconditional version of (|2 T|i in ([6]) is satisfied. 


C. Proof of Theorem [6] 

It is easy to see that W* must also be the optimal solution 
of V with fixed 7*, which is a convex problem w.r.t. W*. 
According to the analysis in Section IV-D1 there exist La¬ 
grange multipliers /x* and u* such that W* maximizes the 
corresponding Lagrange function L(/x*,i/*,W) in (15 1 . The 
optimal solution W* of the maximization problem in ( | 15| ) with 
(/r, is) = (/x*, is*) is given by (18 1 with (/x, is) in A„ equal to 


(/x*, is*). From ( flSj i, it is easy to see that W* — W* 2 = 0, Vn, 
from which Theorem [6] follows immediately. 
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